    Proximal Bellman mappings for reinforcement learning and their application to robust adaptive filtering

    This paper addresses the algorithmic/theoretical core of reinforcement learning (RL) by introducing the novel class of proximal Bellman mappings. These mappings are defined in reproducing kernel Hilbert spaces (RKHSs) to benefit from the rich approximation properties and inner product of RKHSs. They are shown to belong to the powerful Hilbertian family of (firmly) nonexpansive mappings, regardless of the values of their discount factors, and they possess ample degrees of design freedom, even to reproduce attributes of the classical Bellman mappings and to pave the way for novel RL designs. An approximate policy-iteration scheme is built on the proposed class of mappings to solve the problem of selecting online, at every time instance, the "optimal" exponent p in a p-norm loss to combat outliers in linear adaptive filtering, without training data and without any knowledge of the statistical properties of the outliers. Numerical tests on synthetic data showcase the superior performance of the proposed framework over several non-RL and kernel-based RL schemes.
    Comment: arXiv admin note: text overlap with arXiv:2210.1175
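
    The paper's mappings operate in RKHSs, but the key (firm) nonexpansiveness property they share can be illustrated with the ordinary Euclidean proximal operator. The following sketch is a toy assumption, not the paper's construction: it uses the proximal operator of the l1-norm (soft thresholding) and numerically checks the firm-nonexpansiveness inequality ||prox(x) - prox(y)||^2 <= <prox(x) - prox(y), x - y>.

        import numpy as np

        def prox_l1(x, lam):
            """Proximal operator of lam*||.||_1 (soft thresholding)."""
            return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

        rng = np.random.default_rng(0)
        lam = 0.5
        for _ in range(1000):
            x, y = rng.normal(size=5), rng.normal(size=5)
            px, py = prox_l1(x, lam), prox_l1(y, lam)
            lhs = np.dot(px - py, px - py)   # ||prox(x) - prox(y)||^2
            rhs = np.dot(px - py, x - y)     # <prox(x) - prox(y), x - y>
            assert lhs <= rhs + 1e-12        # firm nonexpansiveness holds
        print("firm nonexpansiveness verified on random pairs")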

    Online dictionary learning from big data using accelerated stochastic approximation algorithms

    Applications involving large-scale dictionary learning tasks strongly motivate online optimization algorithms for generally non-convex and non-smooth problems. In this big data context, the present paper develops an online learning framework by jointly leveraging the stochastic approximation paradigm with first-order acceleration schemes. The generally non-convex objective, evaluated online at the resultant iterates, enjoys a quadratic rate of convergence. The generality of the novel approach is demonstrated in two online learning applications: (i) online linear regression using the total least-squares approach; and (ii) a semi-supervised dictionary learning approach to network-wide link load tracking and imputation of real data with missing entries. In both cases, numerical tests highlight the potential of the proposed online framework for big data network analytics.
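
    The coupling of stochastic approximation with first-order acceleration can be conveyed by a small example. The sketch below is a hypothetical Nesterov-style accelerated stochastic gradient iteration for online least squares; the paper's framework targets general non-convex, non-smooth objectives, so this only illustrates the flavor of the acceleration scheme, and all quantities (the momentum point z, the parameter t, the step size) are assumptions of the demo.

        import numpy as np

        rng = np.random.default_rng(1)
        d = 10
        w_true = rng.normal(size=d)

        w = np.zeros(d)    # current estimate
        z = np.zeros(d)    # extrapolation (momentum) point
        t = 1.0            # acceleration parameter

        for k in range(1, 5001):
            a = rng.normal(size=d)                     # streaming regressor
            y = a @ w_true + 0.01 * rng.normal()       # noisy observation
            grad = (z @ a - y) * a                     # stochastic gradient at z
            step = 0.5 / np.sqrt(k)                    # diminishing step size
            w_new = z - step * grad                    # gradient step
            t_new = 0.5 * (1 + np.sqrt(1 + 4 * t**2))  # Nesterov momentum update
            z = w_new + ((t - 1) / t_new) * (w_new - w)
            w, t = w_new, t_new

        print("estimation error:", np.linalg.norm(w - w_true))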

    Trading off communications bandwidth with accuracy in adaptive diffusion networks

    In this paper, a novel algorithm for bandwidth reduction in adaptive distributed learning is introduced. We deal with diffusion networks, in which the nodes cooperate with each other by exchanging information in order to estimate an unknown parameter vector of interest. We seek solutions within the framework of set-theoretic estimation. Moreover, in order to reduce the bandwidth required by the transmitted information, which is dictated by the dimension of the unknown vector, we choose to project onto and work in a lower-dimensional Krylov subspace. This provides the benefit of trading off dimensionality with accuracy. Full convergence properties are presented, and experiments within the system identification task demonstrate the robustness of the algorithmic technique.
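
    As a rough, hypothetical illustration of the dimensionality-versus-accuracy trade-off, the sketch below builds an orthonormal basis of the order-m Krylov subspace span{p, Rp, ..., R^{m-1}p} and projects a vector onto it, so that only m coefficients need to be exchanged instead of d entries; R, p, and m here are stand-in quantities chosen for the demo, not the paper's exact construction.

        import numpy as np

        def krylov_basis(R, p, m):
            """Orthonormal basis of span{p, R p, ..., R^{m-1} p} via Gram-Schmidt."""
            d = len(p)
            Q = np.zeros((d, m))
            Q[:, 0] = p / np.linalg.norm(p)
            for j in range(1, m):
                v = R @ Q[:, j - 1]
                v -= Q[:, :j] @ (Q[:, :j].T @ v)   # orthogonalize against earlier vectors
                Q[:, j] = v / np.linalg.norm(v)
            return Q

        rng = np.random.default_rng(2)
        d, m = 50, 5                    # ambient vs. reduced dimension
        A = rng.normal(size=(d, d))
        R = A @ A.T / d                 # stand-in for an autocorrelation matrix
        w = rng.normal(size=d)          # unknown vector of interest
        p = R @ w                       # stand-in for a cross-correlation vector

        Q = krylov_basis(R, p, m)
        w_proj = Q @ (Q.T @ w)          # only m coefficients need be transmitted
        print("relative error:", np.linalg.norm(w - w_proj) / np.linalg.norm(w))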

    A Sparsity-Aware Adaptive Algorithm for Distributed Learning

    In this paper, a sparsity-aware adaptive algorithm for distributed learning in diffusion networks is developed. The algorithm follows the set-theoretic estimation rationale. At each time instance and at each node of the network, a closed convex set, known as a property set, is constructed based on the received measurements; this defines the region in which the solution is searched for. In this paper, the property sets take the form of hyperslabs. The goal is to find a point that belongs to the intersection of these hyperslabs. To this end, sparsity-encouraging variable-metric projections onto the hyperslabs have been adopted. Moreover, sparsity is also imposed by employing variable-metric projections onto weighted ℓ1 balls. A combine-then-adapt cooperation strategy is adopted. Under some mild assumptions, the scheme enjoys monotonicity, asymptotic optimality, and strong convergence to a point that lies in the consensus subspace. Finally, numerical examples verify the validity of the proposed scheme in comparison with other algorithms that have been developed in the context of sparse adaptive learning.
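
    The basic building block of such a scheme is the metric projection onto a hyperslab. The sketch below is a simplified assumption, using the identity metric rather than the paper's sparsity-promoting variable metric: it implements the closed-form Euclidean projection onto {x : |a^T x - b| <= eps} and applies it sequentially to streaming measurements, POCS-style.

        import numpy as np

        def project_hyperslab(w, a, b, eps):
            """Euclidean projection of w onto {x : |a^T x - b| <= eps}."""
            r = a @ w - b
            if r > eps:
                return w - ((r - eps) / (a @ a)) * a
            if r < -eps:
                return w - ((r + eps) / (a @ a)) * a
            return w                                  # already inside the hyperslab

        rng = np.random.default_rng(3)
        w_true = np.array([0.0, 2.0, 0.0, -1.0, 0.0])  # sparse unknown
        w = np.zeros(5)
        for _ in range(200):
            a = rng.normal(size=5)                     # regressor at this node
            b = a @ w_true + 0.01 * rng.normal()       # noisy measurement
            w = project_hyperslab(w, a, b, eps=0.05)   # sequential projection update
        print("estimate:", np.round(w, 2))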